A Modular Framework to Learn Seed Ontologies from Text

Authors

  • Davide Eynard
  • Matteo Matteucci
  • Fabio Marfia
Abstract

Ontologies are the basic building blocks of modern knowledge-based systems; however, the effort and expertise required to develop them often prevent their widespread adoption. In this chapter, the authors present a tool for the automatic discovery of basic ontologies (which they call seed ontologies) starting from a corpus of documents related to a specific domain of knowledge. These seed ontologies are not meant for direct use, but they can bootstrap the knowledge acquisition process by providing a selection of relevant terms and fundamental relationships. The tool is modular: it allows different methods and strategies to be integrated for indexing the corpus, selecting relevant terms, and discovering hierarchies and other relationships among terms. Like any induction process, ontology learning from text is prone to errors, so the authors do not expect a 100% correct ontology; according to their evaluation the result is closer to 80%, but this should be enough for a domain expert to complete the work with limited effort and in a short time.

DOI: 10.4018/978-1-4666-0188-8.ch002
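The modular design described in the abstract (pluggable strategies for indexing, term selection, and hierarchy discovery) can be illustrated with a minimal Python sketch. This is not the authors' tool: every name below (`bag_of_words_indexer`, `top_k_by_document_frequency`, `subsumption_hierarchy`, `learn_seed_ontology`) is hypothetical, and the strategies are simple stand-ins chosen for brevity. In particular, the hierarchy step uses a naive document-set subsumption heuristic, which is an assumption made here and not a method confirmed by the abstract.

```python
from collections import Counter
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stage types: each stage is a plain callable, so any
# strategy with the same signature can be swapped in.
Index = Dict[str, Counter]  # document id -> term counts
Indexer = Callable[[Dict[str, str]], Index]
TermSelector = Callable[[Index], List[str]]
HierarchyFinder = Callable[[Index, List[str]], Set[Tuple[str, str]]]


def bag_of_words_indexer(corpus: Dict[str, str]) -> Index:
    """Index each document as lowercase, whitespace-separated term counts."""
    return {doc_id: Counter(text.lower().split()) for doc_id, text in corpus.items()}


def top_k_by_document_frequency(k: int) -> TermSelector:
    """Build a selector keeping the k terms that occur in the most documents."""
    def select(index: Index) -> List[str]:
        df: Counter = Counter()
        for counts in index.values():
            df.update(counts.keys())
        # Ties are broken alphabetically so the result is deterministic.
        return sorted(df, key=lambda t: (-df[t], t))[:k]
    return select


def subsumption_hierarchy(index: Index, terms: List[str]) -> Set[Tuple[str, str]]:
    """Assumed heuristic: a is a parent of b when the documents containing b
    form a proper subset of the documents containing a."""
    docs = {t: {d for d, counts in index.items() if t in counts} for t in terms}
    return {(a, b) for a in terms for b in terms if a != b and docs[b] < docs[a]}


def learn_seed_ontology(
    corpus: Dict[str, str],
    indexer: Indexer,
    selector: TermSelector,
    hierarchy_finder: HierarchyFinder,
) -> Tuple[List[str], Set[Tuple[str, str]]]:
    """Run the three pluggable stages and return (terms, parent-child pairs)."""
    index = indexer(corpus)
    terms = selector(index)
    return terms, hierarchy_finder(index, terms)


if __name__ == "__main__":
    corpus = {
        "d1": "ontology class",
        "d2": "ontology property",
        "d3": "ontology class instance",
    }
    terms, edges = learn_seed_ontology(
        corpus, bag_of_words_indexer, top_k_by_document_frequency(3), subsumption_hierarchy
    )
    print(terms)
    print(sorted(edges))
```

Because each stage is only a function signature, replacing one strategy (for example, a different term-ranking rule) leaves the other stages untouched, which is the kind of modularity the abstract describes. The output is a draft for a domain expert to correct, not a finished ontology.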

Similar resources

On the Use of Correspondence Analysis to Learn Seed Ontologies from Text

In the present work we show our approach to generate hierarchies of concepts in the form of ontologies starting from free text. This approach relies on the statistical model of Correspondence Analysis to analyze term occurrences in text, identify the main concepts it refers to, and retrieve semantic relationships between them. We present a tool which is able to apply different methods for the g...

Bootstrapping Biomedical Ontologies for Scientific Text using NELL

We describe an open information extraction system for biomedical text based on NELL (the Never-Ending Language Learner) (Carlson et al., 2010), a system designed for extraction from Web text. NELL uses a coupled semi-supervised bootstrapping approach to learn new facts from text, given an initial ontology and a small number of “seeds” for each ontology category. In contrast to previous applicat...

An Ontology-based Framework for Text Mining

Structuring of text document knowledge frequently appears either by ontologies and metadata or by automatic (un-)supervised text categorization. This paper describes our integrated framework OTTO (OnTology-based Text mining framewOrk). OTTO uses text mining to learn the target ontology from text documents and then uses the same target ontology in order to improve the effectiveness of both sup...

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

Implementing Modular Ontologies with Distributed Description Logics

In an earlier paper, we presented a logical framework for representing and reasoning with modular ontologies with a special focus on supporting localized reasoning and integrity in the face of changes. This framework, while being based on a formal semantics, was not specific to a particular logic used to specify ontologies and links between modules. As a result, no system was provided that imple...

Publication date: 2015